FedMECA: Scalable Federated Learning via Memory-Efficient and Concurrent Aggregation

Zhonghao Chen (University of Florida), Duo Zhang (University of California, Merced), Xiaoyi Lu (University of Florida)

System Optimization & Efficiency

FedMECA is a federated learning aggregation system that decouples model collection from aggregation to overcome scalability failures as client counts or model sizes grow. By enabling concurrent, memory-efficient aggregation, it makes federated training viable at scales where existing systems stall.

Presentation

Talk

Paper Session 6: Learning & Control

Thursday, May 28 · 3:50 PM – 4:00 PM

Bayshore Ballroom

Poster

Thursday, May 28 · 4:30 PM – 6:00 PM

Carmel

Abstract

Federated learning (FL) enables collaborative model training across distributed clients while preserving data privacy, but it faces growing scalability issues that cause training to fail as the number of participating clients or the model size increases. Existing aggregation paradigms usually overlook the memory and computational challenges that arise from tightly coupling model collection with aggregation. Within such paradigms, aggregation must wait for all selected client updates to arrive, and the aggregation step itself is computationally demanding. To overcome these limitations, we propose FedMECA, a scalable, memory-efficient, and concurrency-aware aggregation framework for FL. FedMECA decouples model collection from aggregation, reducing memory pressure on the central server by 36.57× and achieving up to 238.5× speedup in aggregation runtime without compromising model accuracy or convergence speed. FedMECA is designed with minimal system complexity and supports heterogeneous clients and non-IID data. Moreover, our approach extends easily to aggregation strategies with different levels of synchrony, offering flexibility and adaptability across diverse FL applications. These results demonstrate that FedMECA enables scalable and efficient training for modern large-scale FL workloads.
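The abstract does not spell out FedMECA's mechanism, so the following is only a generic sketch of why separating collection from aggregation can relieve server memory, not the authors' implementation. It assumes a FedAvg-style weighted average over flattened weight vectors; the names `buffered_fedavg` and `StreamingAggregator` are hypothetical and chosen for illustration. The buffered version holds every client model until the last one arrives, while the streaming version folds each update into a running sum as soon as it is received, so the server keeps one accumulator regardless of client count.

```python
from typing import Iterable, List, Tuple

# Hypothetical update type: (flattened model weights, number of local samples).
Update = Tuple[List[float], int]


def buffered_fedavg(updates: Iterable[Update]) -> List[float]:
    """Baseline: collect every update first, then average.

    Server memory grows with the number of clients because all
    client models are held at once.
    """
    buffered = list(updates)
    if not buffered:
        raise ValueError("no updates received")
    total = sum(n for _, n in buffered)
    dim = len(buffered[0][0])
    return [sum(w[i] * n for w, n in buffered) / total for i in range(dim)]


class StreamingAggregator:
    """Folds each update into a running weighted sum as it arrives.

    Only one accumulator of model size is kept, independent of how
    many clients participate in the round.
    """

    def __init__(self, dim: int) -> None:
        self.weighted_sum = [0.0] * dim
        self.total_samples = 0

    def add(self, weights: List[float], num_samples: int) -> None:
        # Accumulate this client's contribution; the update can be
        # discarded immediately afterwards.
        for i, w in enumerate(weights):
            self.weighted_sum[i] += w * num_samples
        self.total_samples += num_samples

    def result(self) -> List[float]:
        if self.total_samples == 0:
            raise ValueError("no updates received")
        return [s / self.total_samples for s in self.weighted_sum]


if __name__ == "__main__":
    updates = [([1.0, 2.0], 1), ([3.0, 4.0], 3)]
    agg = StreamingAggregator(dim=2)
    for weights, n in updates:
        agg.add(weights, n)
    print(agg.result())
    print(buffered_fedavg(updates))
```

Both functions compute the same weighted average; the difference is only in when the work happens and how much state the server keeps. A concurrent variant would additionally need synchronization around `add`, which this sketch omits.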

Artifacts & Links

Paper (ACM Digital Library)

Authors

Zhonghao Chen (University of Florida)
Duo Zhang (University of California, Merced)
Xiaoyi Lu (University of Florida)